Mapping of Sequence Reads to the Reference Genomes    ◾    75

In the following, we will use “bwa aln” to perform the first step of the alignment. Run
“bwa aln” without any options on the command line to learn more about its usage and
options. If the quality of the reads at the 3′ end is low, we can use the “-q” option with this
command to specify a quality threshold for trimming reads (down to a minimum of 35 bp).
Run the following commands from the parent directory of the “refgenome” and “data” directories:

bwa aln \

refgenome/GRCh38.p13_ref.fna \

data/SRR769545_1.fastq.gz \

> data/SRR769545_1.sai

bwa aln \

refgenome/GRCh38.p13_ref.fna \

data/SRR769545_2.fastq.gz \

> data/SRR769545_2.sai
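
If the 3′ ends of the reads are of low quality, the “-q” option mentioned above can be added to both runs. The following is a minimal sketch rather than a recommended setting: the threshold of 15 and the RUN switch are illustrative assumptions that do not come from the text. By default the script only prints the commands so that they can be inspected first.

```shell
# Sketch: run "bwa aln" on both mates with 3'-end quality trimming (-q).
# Q=15 is an illustrative threshold (an assumption), not a value from the text.
Q=15
REF=refgenome/GRCh38.p13_ref.fna
SAMPLE=SRR769545

# Print the "bwa aln" command for one mate (1 or 2).
aln_cmd() {
    printf 'bwa aln -q %s %s data/%s_%s.fastq.gz' "$Q" "$REF" "$SAMPLE" "$1"
}

for mate in 1 2; do
    out="data/${SAMPLE}_${mate}.sai"
    # Show the full command, including the output redirection.
    echo "$(aln_cmd "$mate") > $out"
    # RUN is a hypothetical switch: set RUN=1 to actually execute the alignment.
    if [ "${RUN:-0}" = "1" ]; then
        bwa aln -q "$Q" "$REF" "data/${SAMPLE}_${mate}.fastq.gz" > "$out"
    fi
done
```

Running the script with RUN=1 in the environment would perform the two alignments and write the same “.sai” files as the commands above.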

Then, we can use “bwa sampe” to generate the SAM file for the alignments.

bwa sampe \

refgenome/GRCh38.p13_ref.fna \

data/SRR769545_?.sai \

data/SRR769545_?.fastq.gz \

> sam/SRR769545_aln.sam 2> sam/SRR769545_aln.log
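
The “?” wildcards in the command above rely on the shell expanding each pattern to exactly two files, in the order that “bwa sampe” expects (the two “.sai” files, then the two FASTQ files). The sketch below adds a guard for that assumption and creates the “sam” output directory first; the helper name check_pair is hypothetical and only serves this illustration.

```shell
# Sketch: verify that each "?" glob matches exactly two existing files
# before calling "bwa sampe" (check_pair is a hypothetical helper).
check_pair() {
    # "$@" holds the already-expanded glob. An unmatched glob is passed
    # through literally as a single word, so the count test rejects it.
    [ "$#" -eq 2 ] && [ -e "$1" ] && [ -e "$2" ]
}

# The output directory must exist before the shell can redirect into it.
mkdir -p sam

if check_pair data/SRR769545_?.sai && check_pair data/SRR769545_?.fastq.gz; then
    bwa sampe \
        refgenome/GRCh38.p13_ref.fna \
        data/SRR769545_?.sai \
        data/SRR769545_?.fastq.gz \
        > sam/SRR769545_aln.sam 2> sam/SRR769545_aln.log
else
    echo "expected exactly two .sai files and two .fastq.gz files in data/" >&2
fi
```

Writing the four file names out explicitly instead of using wildcards would be an equally valid and somewhat more robust choice.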

2.3.2.2  Bowtie2

Bowtie2 is an aligner that uses the BWT and the FM-index as data structures for indexing the
reference genome. It is an ultrafast, memory-efficient short-read aligner that can map
millions of reads to a reference genome on a typical desktop computer. Bowtie2 is the
successor to the original Bowtie, which requires reads of equal length and does not align
reads with gaps; Bowtie2 was developed to overcome those limitations. It performs read
mapping in four steps: (i) extraction of seeds from the reads and their reverse complements,
(ii) exact ungapped alignment of the seeds using the FM-index, (iii) sorting the seed
alignments by score and identifying their positions on the reference genome from the index,
and (iv) extending the seeds into full alignments using parallel dynamic programming [16].
Bowtie2 can be installed on Linux with the following commands:

git clone https://github.com/BenLangmead/bowtie2.git

cd bowtie2; make

Then, add the Bowtie2 directory to your PATH so that you can run Bowtie2 from any directory,
by editing the “.bashrc” file in your home directory.

cd $HOME

vim .bashrc
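
As a non-interactive alternative to editing the file in vim, the PATH entry could be appended from the command line. This is only a sketch: the function name add_path_line is hypothetical, and the directory passed to it depends on where the repository was actually cloned.

```shell
# Sketch: append an "export PATH=..." line to a shell startup file.
# add_path_line is a hypothetical helper; arguments are <file> <directory>.
add_path_line() {
    file=$1
    dir=$2
    # The line keeps $PATH unexpanded so it is evaluated at login time.
    line="export PATH=\"$dir:\$PATH\""
    # Append only if the exact line is not already present, so that
    # re-running the helper does not create duplicates.
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Assuming the repository was cloned into the home directory, the helper would be called as add_path_line "$HOME/.bashrc" "$HOME/bowtie2", followed by “source ~/.bashrc” (or opening a new terminal) so that the change takes effect.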